We introduce a class of birth-and-death Pólya urns, which allow for both sampling
and removal of observations governed by an auxiliary inhomogeneous Bernoulli
process, and investigate the asymptotic behaviour of the induced allelic partitions.
By exploiting some embedded models, we show that the asymptotic regimes exhibit
a phase transition from partitions with almost surely infinitely many blocks and
independent counts, to stationary partitions with a random number of blocks. The
first regime corresponds to limits of Ewens-type partitions and includes a result of
Arratia, Barbour and Tavaré (1992) as a special case. We identify the invariant
and reversible measure in the second regime, which preserves asymptotically the
dependence between counts, and is shown to be a mixture of Ewens sampling
formulas, with a tilted Negative Binomial mixing distribution on the sample size.

1 Introduction and outline of the results


Pólya urn schemes provide easily interpretable exchangeable sequences and are among the most
celebrated sampling rules in probability. See Johnson and Kotz (1977) and Mahmoud (2009) for
general treatments. Of particular interest for our purposes is the Blackwell-MacQueen Pólya
urn (Blackwell and MacQueen, 1973): given $\lambda > 0$ and a nonatomic probability measure $P_0$ on
a Polish space $\mathbb{X}$, a sequence sampled from a Pólya urn is such that $X_1 \sim P_0$ and, for $n \ge 1$,

(1)   $X_{n+1} \mid X_1, \ldots, X_n \sim \frac{\lambda}{\lambda + n}\, P_0 + \frac{1}{\lambda + n} \sum_{i=1}^{n} \delta_{X_i},$

where $\delta_x$ denotes a point mass at $x$. Since $P_0$ has no atoms, if $X_{n+1}$ is sampled from $P_0$ a new
value is observed, otherwise $X_{n+1}$ is a copy of a previous observation. Hence a Pólya urn sample
will feature ties with positive probability, inducing a partition of the observed values. A popular
interpretation of the above scheme is as a species sampling model (Pitman, 1996), whereby the
observations label species sampled from a large population, and those drawn from $P_0$ are species
that have not been previously observed.
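
The sampling rule can be illustrated with a minimal sketch, assuming the standard form of the Blackwell-MacQueen predictive distribution (a new draw from $P_0$ with probability $\lambda/(\lambda + n)$, otherwise a uniformly chosen copy). The function names `polya_urn_sample` and `allelic_partition`, and the choice of a uniform base measure, are illustrative assumptions and not part of the paper.

```python
import random
from collections import Counter

def polya_urn_sample(n, lam, base_draw, rng):
    """Draw X_1, ..., X_n from a Blackwell-MacQueen urn with total mass lam."""
    xs = []
    for k in range(n):
        # With probability lam / (lam + k) draw a fresh value from the base measure,
        # otherwise copy a uniformly chosen earlier observation.
        if rng.random() < lam / (lam + k):
            xs.append(base_draw(rng))
        else:
            xs.append(rng.choice(xs))
    return xs

def allelic_partition(xs):
    """Return {i: m_i}, where m_i is the number of distinct values seen exactly i times."""
    multiplicities = Counter(xs).values()
    return dict(Counter(multiplicities))

rng = random.Random(0)
# Hypothetical nonatomic base measure: uniform on [0, 1).
sample = polya_urn_sample(50, lam=2.0, base_draw=lambda r: r.random(), rng=rng)
m = allelic_partition(sample)
```

Since the base measure is nonatomic, ties in `sample` can only arise from the copying step, so `m` is the allelic partition induced by the urn.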

The impact of the Blackwell-MacQueen Pólya urn scheme and its developments has been
extremely significant in applied probability and statistics, particularly through the construction
and characterisation of random probability measures via limits of exchangeable sequences
(Blackwell and MacQueen, 1973; Pitman, 1995; 1996; 2006; Gnedin and Pitman, 2005; Lijoi,
Mena and Prünster, 2005; 2007), and as a building block in the architecture of computational
strategies for posterior inference with Bayesian nonparametric mixture models (Escobar and
West, 1995; MacEachern and Müller, 1998; Neal, 2000; Ishwaran and James, 2001).

In these respects, particularly relevant is its relationship with the Ewens sampling formula,
which assigns probability

(2)   $\mathrm{ESF}_n(m) = \frac{n!}{\lambda_{(n)}} \prod_{i=1}^{n} \left(\frac{\lambda}{i}\right)^{m_i} \frac{1}{m_i!}\; \mathbb{1}\!\left(\sum_{i=1}^{n} i m_i = n\right)$

to vectors $m = (m_1, \ldots, m_n) \in \mathbb{Z}_+^n$, where $\lambda_{(n)} = \lambda(\lambda+1)\cdots(\lambda+n-1)$ and $\mathbb{1}(\cdot)$ is the
indicator function. Originally introduced for describing the sampling distribution of allelic
frequencies in a neutral population at equilibrium (Ewens, 1972), this provides the law of the
"allelic" partition $(m_1, \ldots, m_n)$ induced by a Pólya urn sample of size $n$, where $m_i$ is the
number of alleles appearing exactly $i$ times. Equivalently, it provides the law of the partition
induced by sampling from a Dirichlet process random probability measure (Antoniak, 1974).
See Crane (2016) (with discussion) for a recent review of applications and connections of the
Ewens sampling formula.
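
As a sanity check on the formula, the following sketch evaluates the Ewens sampling formula in its usual form, $n!/\lambda_{(n)} \prod_i (\lambda/i)^{m_i}/m_i!$, and sums it over all allelic vectors of a small sample size. The helper names (`rising_factorial`, `esf_pmf`, `allelic_vectors`) are hypothetical and chosen only for this illustration.

```python
from math import factorial, prod

def rising_factorial(a, n):
    """Pochhammer symbol a_(n) = a (a + 1) ... (a + n - 1)."""
    out = 1.0
    for k in range(n):
        out *= a + k
    return out

def esf_pmf(m, lam):
    """Ewens sampling formula at m = (m_1, ..., m_n), with n = sum_i i * m_i."""
    n = sum(i * mi for i, mi in enumerate(m, start=1))
    weight = prod((lam / i) ** mi / factorial(mi) for i, mi in enumerate(m, start=1))
    return factorial(n) / rising_factorial(lam, n) * weight

def allelic_vectors(n):
    """Yield every m in Z_+^n with sum_i i * m_i = n."""
    def rec(i, remaining, prefix):
        if i > n:
            if remaining == 0:
                yield tuple(prefix)
            return
        for mi in range(remaining // i + 1):
            yield from rec(i + 1, remaining - i * mi, prefix + [mi])
    yield from rec(1, n, [])

# The probabilities of all allelic partitions of n = 6 items should add up to one.
total = sum(esf_pmf(m, lam=1.5) for m in allelic_vectors(6))
```

The enumeration grows with the number of integer partitions of $n$, so this check is only practical for small $n$.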

In this paper we consider a class of birth-and-death Pólya urns (B&D-PUs for short), which in
addition to adding observations according to (1) allow observations to be removed from the current
sample, and investigate their asymptotic regimes under a certain specification of the probability
of a removal step. Rather than extending the predictive distribution in (1), we define these
directly in terms of the dynamics induced on the associated allelic partition. For any $\beta \in (0,1]$,
define a B&D-PU as the Markov chain $M = \{M(h), h \in \mathbb{Z}_+\}$ with state space $\mathbb{Z}_+^\infty$ and transition
probabilities

(3)   $p(m' \mid m) \propto \begin{cases} \beta\lambda, & m' = m + e_1, \\ \beta\, i m_i, & m' = m - e_i + e_{i+1},\ i \ge 1, \\ (1-\beta)\, i m_i, & m' = m - e_i + e_{i-1},\ i \ge 1, \end{cases}$

where $\propto$ denotes proportionality and $e_i$ denotes the vector whose only nonzero entry is a one
in position $i$, with $e_0 := (0, 0, \ldots)$.
Figure 1: Graph representation of the partition-valued process induced by B&D-PUs
restricted to $(m_1, m_2)$ (left), and probabilities of admissible transitions,
up to proportionality (right).


Note that the normalising constant in (3) is $\beta\lambda + \sum_{i \ge 1} i m_i$. Without loss of generality, and
for convenience, we let $M$ and the other auxiliary chains introduced later start from the origin
$(0, 0, \ldots)$, instead of assigning them an initial distribution. The transitions in (3) correspond
respectively to the introduction of a new block of size one; to a block of size $i$ becoming of size
$i + 1$; and to a block of size $i$ becoming of size $i - 1$. Here $i m_i$ is the total number of items in
blocks of size $i$. The dynamics of $M$, restricted to its first two coordinates, are depicted on a
lattice in Figure 1. Equivalently, the above transitions can be expressed in terms of the underlying
process of observations, whereby with probability

(4)   $\delta(\beta, m) = \frac{\beta\left(\lambda + \sum_{i \ge 1} i m_i\right)}{\beta\lambda + \sum_{i \ge 1} i m_i}$

a further observation is drawn from (1) and added to the sample, and with probability
$1 - \delta(\beta, m)$ an observation is chosen uniformly from the current sample and removed. Setting
$\beta = 1$ above reduces (3) to the usual dynamics induced on partitions by the Pólya urn (1) (cf.,
e.g., Feng (2010), Section 2.7.2), whereas $\beta = 0$ would simply remove sequentially all items
currently available until none is left, hence it is not considered here. Note that the random
allelic partitions induced by Pólya urns are consistent under uniform deletion, i.e., the partition
obtained by removing a uniformly chosen item from an $\mathrm{ESF}_n$-distributed partition of $n$ elements
has distribution $\mathrm{ESF}_{n-1}$; cf., e.g., Crane (2016). Hence perturbing the Pólya urn dynamics by a
finite number of uniform removals does not affect its limiting behaviour. Here, however, we
allow for an infinite number of removals according to an auxiliary inhomogeneous Bernoulli
process with state-dependent probability $1 - \delta(\beta, m)$, and study the implied long-run behaviour.
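
The dynamics on the allelic partition can be simulated directly. The sketch below assumes the transition weights described above ($\beta\lambda$ for a new singleton, $\beta\, i m_i$ for growth and $(1-\beta)\, i m_i$ for shrinkage of a block of size $i$); the dictionary representation of $m$ and the names `bd_pu_step` and `_move` are illustrative choices, not the authors' implementation.

```python
import random

def _move(m, i, j):
    """Move one block from size i to size j (j = 0 means the block disappears)."""
    m[i] -= 1
    if m[i] == 0:
        del m[i]
    if j > 0:
        m[j] = m.get(j, 0) + 1

def bd_pu_step(m, beta, lam, rng):
    """One transition of the chain; m maps block size i to the count m_i > 0."""
    n = sum(i * mi for i, mi in m.items())
    u = rng.random() * (beta * lam + n)   # normalising constant beta*lam + sum_i i*m_i
    if u < beta * lam:                    # a new singleton block enters
        m[1] = m.get(1, 0) + 1
        return m
    u -= beta * lam
    for i, mi in sorted(m.items()):
        if u < beta * i * mi:             # a block of size i grows to i + 1
            _move(m, i, i + 1)
            return m
        u -= beta * i * mi
        if u < (1 - beta) * i * mi:       # a block of size i shrinks to i - 1
            _move(m, i, i - 1)
            return m
        u -= (1 - beta) * i * mi
    return m  # only reachable through floating-point rounding

rng = random.Random(1)
m, sizes = {}, []
for _ in range(5000):
    m = bd_pu_step(m, beta=0.3, lam=2.0, rng=rng)
    sizes.append(sum(i * mi for i, mi in m.items()))
```

Each step changes the total number of items by one, which matches the description of the underlying process as alternating additions and uniform removals.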

We show that B&D-PUs exhibit asymptotically a phase transition at $\beta = 1/2$ from stationary
dynamics with a random number of blocks $K = \sum_{i \ge 1} m_i$ to nonstationary dynamics which yield
almost surely infinitely many blocks. To this aim, we first study the stationary properties of
an auxiliary system of finite-dimensional Markov chains on the space of partitions. These differ
from other commonly used partition-valued processes, e.g., those in Petrov (2009), which are
indexed by a fixed sample size, or those in Crane (2014), which are indexed by the maximum
number of blocks. Our auxiliary processes are instead indexed by a maximal allelic count, i.e.,
the maximum number of items allowed in each block, whereas both the number of blocks and the
sample size are left free to vary. The ergodicity of these chains is then exploited by embedding
them, at certain stopping times, in B&D-PUs, whose asymptotic distributions coincide with the
weak limits of the stationary laws of the auxiliary chains. Specifically, values of $\beta \in [1/2, 1]$ for
B&D-PUs in the long run generate infinite structures analogous to those induced by limits of
Ewens partitions. In this limit, the allelic counts $(m_1, m_2, \ldots)$ are asymptotically independent
with Poisson distribution of mean $\lambda/i$, irrespective of the value of $\beta$. This result for $\beta = 1$ was
first proved by Arratia, Barbour and Tavaré (1992). Values of $\beta \in (0, 1/2)$ are instead shown to
generate stationary models on $\mathbb{Z}_+^\infty$ with invariant and reversible distribution

(5)   $\pi(m) = \frac{\beta\lambda + \sum_{i \ge 1} i m_i}{\beta\lambda + \beta\lambda/(1-2\beta)} \prod_{i \ge 1} \mathrm{Po}(m_i; \theta_i/i), \qquad \theta_i = \left(\frac{\beta}{1-\beta}\right)^{i} \lambda,$

where $\mathrm{Po}(\cdot\,; \theta)$ is a Poisson probability mass function with mean $\theta$. Here several elements are
of interest. The first is the dependence between the allelic counts $m_i$, which, contrary to the
$\beta \ge 1/2$ case, is preserved in the limit. An interpretation for this dependence can be given
by taking the $m_i$ to be independent Poisson variables of mean $\theta_i/i$, with $\theta_i$ as in (5), and letting
$J = j$ with probability proportional to $\theta_j$ for $j \ge 1$ or to $\beta\lambda$ for $j = 0$. Then the vector
$(m_1, \ldots, m_{J-1}, m_J + 1, m_{J+1}, \ldots)$ has distribution (5), where no count is increased when $J = 0$.
The second is the fact that the expected number of items $E(i m_i)$ in groups of size $i$ can be
easily checked to be proportional to $\theta_i$ and thus depends on $i$, whereas in the Ewens case
$E(i m_i) = \lambda$ in the limit. Recently, Betz and Ueltschi (2011) and Betz, Ueltschi and Velenik (2011)
studied the asymptotic behaviour of generalised sampling formulas where the distribution of the
counts also depends on the count index $i$, obtained by replacing $\lambda$ in (2) with $\lambda_i$ for a sequence
$(\lambda_1, \lambda_2, \ldots)$ of nonnegative reals. A third element of interest is the fact that the number of
groups at stationarity is random and finite, with distribution determined by $(\beta, \lambda)$; see (20)
below. Gnedin (2010) studied a partition structure generated sequentially which yields a finite,
random number of blocks. Finally, in this stationary regime, the underlying system of particles
is also stationary with invariant measure related to a mixture of Pólya urn schemes, with a tilted
Negative Binomial mixing distribution on the sample size. The latter is defined here as the total
number of items in the system at a given time, i.e., $n := \sum_{i \ge 1} i m_i$.

The paper is organised as follows. Section 2 defines the auxiliary system of Markov chains
with maximal allelic counts and identifies their invariant measure. Section 3 proves the phase
transition for the partition structures generated asymptotically by B&D-PUs, identifies their
limiting distributions and shows the reversibility for $\beta \in (0, 1/2)$. Finally, Section 4 highlights
a connection of our results with a mixture of Ewens sampling formulas.


2 Chains with maximal allelic count 


In this section we define and study a system of finite-dimensional partition-valued Markov
chains, which are instrumental for the investigation of the asymptotic regimes of B&D-PUs. Fix
$L \in \mathbb{N}$ and $\mu > 0$, and define $M_L = \{M_L(h), h \in \mathbb{Z}_+\}$ to be the $\mathbb{Z}_+^L$-valued Markov chain with
transition probabilities

(6)   $p_L(m' \mid m) \propto \begin{cases} \beta\lambda, & m' = m + e_1, \\ \beta\, i m_i, & m' = m - e_i + e_{i+1},\ 1 \le i \le L-1, \\ \beta\, L m_L, & m' = m - e_L, \\ (1-\beta)\, i m_i, & m' = m - e_i + e_{i-1},\ 1 \le i \le L, \\ (1-\beta)\mu, & m' = m + e_L, \end{cases}$

with normalising constant $\beta\lambda + (1-\beta)\mu + |m|$, for $|m| = \sum_{i=1}^{L} i m_i$. Here the difference with
respect to (3) is that the system has a maximal allelic count $m_L$, whereby a group of size $L$
which becomes of size $L + 1$ is removed from the system, with probability proportional to $\beta L m_L$,
and groups of size $L$ can be inserted in the system, with probability proportional to $(1-\beta)\mu$.

The following result identifies the invariant measure of $M_L$.


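Theorem 2.1. $M_L$ with transition probabilities (6) has unique invariant distribution

(7)   $\pi_L(m) = \frac{\beta\lambda + (1-\beta)\mu + |m|}{\beta\lambda + (1-\beta)\mu + \sum_{i=1}^{L} \theta_i}\; \tilde\pi_L(m), \qquad m \in \mathbb{Z}_+^L,$

where

(8)   $\theta_i = w_i(\beta)\lambda + (1 - w_i(\beta))\mu, \qquad w_i(\beta) = \begin{cases} \dfrac{\beta^{i}\left[(1-\beta)^{L-i+1} - \beta^{L-i+1}\right]}{(1-\beta)^{L+1} - \beta^{L+1}}, & \beta \ne 1/2, \\[2mm] \dfrac{L-i+1}{L+1}, & \beta = 1/2, \end{cases}$

and

(9)   $\tilde\pi_L(m) = \prod_{i=1}^{L} \mathrm{Po}(m_i; \theta_i/i).$

Proof. Since the transition probabilities in (6) have normalising constant $\beta\lambda + (1-\beta)\mu + |m|$,
for $\pi_L$ as in (7) the global balance condition reads

(10)   $\tilde\pi_L(m')\left(\beta\lambda + (1-\beta)\mu + |m'|\right) = \beta\Big[\tilde\pi_L(m' - e_1)\lambda + \sum_{i=1}^{L-1} \tilde\pi_L(m' + e_i - e_{i+1})\, i(m'_i + 1) + \tilde\pi_L(m' + e_L)\, L(m'_L + 1)\Big]$
       $\qquad + (1-\beta)\Big[\tilde\pi_L(m' + e_1)(m'_1 + 1) + \sum_{i=2}^{L} \tilde\pi_L(m' + e_i - e_{i-1})\, i(m'_i + 1) + \tilde\pi_L(m' - e_L)\mu\Big].$

The right hand side equals $\tilde\pi_L(m')(\beta\lambda + (1-\beta)\mu + |m'|)$ upon imposing

    $\beta\lambda + (1-\beta)\theta_2 = \theta_1,$
    $\beta\theta_{i-1} + (1-\beta)\theta_{i+1} = \theta_i, \quad i = 2, \ldots, L-1,$
    $\beta\theta_{L-1} + (1-\beta)\mu = \theta_L,$
    $\beta\theta_L + (1-\beta)\theta_1 = \beta\lambda + (1-\beta)\mu.$

The last equation equals the sum of the first $L$, hence $(\theta_1, \ldots, \theta_L)$ is the solution of the system
of $L$ linear equations

    $\theta_1 - (1-\beta)\theta_2 = \beta\lambda,$
    $\theta_i - \beta\theta_{i-1} - (1-\beta)\theta_{i+1} = 0, \quad i = 2, \ldots, L-1,$
    $\theta_L - \beta\theta_{L-1} = (1-\beta)\mu.$

Lemma A.1 in the Appendix now implies that the solution of the system is given by (8), and
the statement follows by dividing both sides of (10) by the normalising constant

    $\sum_{m \in \mathbb{Z}_+^L} \left(\beta\lambda + (1-\beta)\mu + |m|\right) \tilde\pi_L(m) = \beta\lambda + (1-\beta)\mu + \sum_{i=1}^{L} \theta_i.$

Finally, uniqueness follows from positive recurrence, which can be easily proved. □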


An interpretation for the dependence among the counts $m_i$ at stationarity can be provided
by means of an alternative representation for the invariant distribution. Let the $m_i$ be mutually
independent, each with $\mathrm{Po}(m_i; \theta_i/i)$ distribution, $\theta_i$ as in (8), and select $J = j$ with probability
proportional to $\theta_j$ for $j = 1, \ldots, L$ and to $\beta\lambda + (1-\beta)\mu$ for $j = 0$. Then
$m' = (m_1, \ldots, m_{J-1}, m_J + 1, m_{J+1}, \ldots, m_L)$ has distribution (7). This can be easily shown by
exploiting the fact that if $Z$ has $\mathrm{Po}(z; \lambda)$ distribution, then $Z + 1$ has probability mass function
$\mathrm{Po}(z; \lambda)\, z/\lambda$. An informal interpretation for the component-specific parameters $\theta_i$ in (8) can be
provided by recalling that new groups enter the system from the left (i.e., $m \mapsto m + e_1$) with
probability proportional to $\lambda$ and from the right (i.e., $m \mapsto m + e_L$) with probability proportional
to $\mu$. It is easily checked that $w_i(\beta)$ is decreasing in $i$. Then (8) expresses the fact that the effect
of $\lambda$ (resp. $\mu$) on $m_i$ is stronger for small (resp. large) $i$. For odd $L$, the weight in the median
parameter $\theta_{(L+1)/2}$ simplifies to

(11)   $w_{(L+1)/2}(\beta) = \frac{\beta^{(L+1)/2}}{\beta^{(L+1)/2} + (1-\beta)^{(L+1)/2}},$

which shows more explicitly the effect of $\beta$ on the median count and its dependence on the
number of counts separating it from the extremal $m_1$ and $m_L$ (see the proof of Lemma A.1).
When $\beta = 1/2$, (11) further simplifies to $1/2$, and $\theta_{(L+1)/2}$ reduces to $(\lambda + \mu)/2$.
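
The role of the parameters $\theta_i$ can be checked numerically. The sketch below assumes that $\theta_i = w_i(\beta)\lambda + (1 - w_i(\beta))\mu$ with $w_i(\beta) = \beta^i[(1-\beta)^{L-i+1} - \beta^{L-i+1}]/[(1-\beta)^{L+1} - \beta^{L+1}]$ for $\beta \ne 1/2$ and $w_i(1/2) = (L-i+1)/(L+1)$, as obtained from Lemma A.1, and verifies that these values satisfy the linear system arising from the balance condition. The function names are hypothetical.

```python
def weight(i, L, beta):
    """w_i(beta): weight of lam in theta_i (closed form assumed from Lemma A.1)."""
    if abs(beta - 0.5) < 1e-12:
        return (L - i + 1) / (L + 1)
    a, b = 1 - beta, beta
    return b**i * (a**(L - i + 1) - b**(L - i + 1)) / (a**(L + 1) - b**(L + 1))

def theta(i, L, beta, lam, mu):
    w = weight(i, L, beta)
    return w * lam + (1 - w) * mu

def residuals(L, beta, lam, mu):
    """Residuals of the L balance equations plus the redundant last one (needs L >= 2)."""
    th = [None] + [theta(i, L, beta, lam, mu) for i in range(1, L + 1)]  # 1-based
    res = [th[1] - (1 - beta) * th[2] - beta * lam]
    res += [th[i] - beta * th[i - 1] - (1 - beta) * th[i + 1] for i in range(2, L)]
    res.append(th[L] - beta * th[L - 1] - (1 - beta) * mu)
    res.append(beta * th[L] + (1 - beta) * th[1] - beta * lam - (1 - beta) * mu)
    return res

worst = max(abs(r) for beta in (0.3, 0.5, 0.8) for r in residuals(7, beta, 2.0, 0.4))
median_weight = weight(4, 7, 0.3)  # odd L = 7, median index (L + 1) / 2 = 4
```

Here `worst` collects the largest residual over three values of $\beta$, and `median_weight` can be compared with the simplified expression for the median weight.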


3 Birth-and-death Pólya urns

B&D-PUs have been defined in the Introduction to be partition-valued Markov chains $M$ with
state space $\mathbb{Z}_+^\infty$ and transition probabilities (3). Informally, the underlying sampling process
can be thought of as Pólya urn sampling where particles are deleted at random times. Here we
exploit the class of chains with maximal allelic count $M_L$, introduced in the previous section,
for identifying the asymptotic regimes of $M$. The strategy is to let a sequence of chains with
maximal capacity $\{M_L, L \in \mathbb{N}\}$ converge to the B&D-PU $M$ as $L \to \infty$, and then obtain the
asymptotic regimes as appropriate limits of the marginal distributions of $M_L$. We achieve this
by letting the probability of introducing $L$-sized blocks in $M_L$ be governed by $\mu_L$ instead of $\mu$
(cf. (6)), and letting $\mu_L$ converge to zero appropriately fast as $L \to \infty$. The key intuition here
is that since the probability of blocks entering the system from left and right is proportional to
$\beta\lambda$ and $(1-\beta)\mu_L$ respectively, the expected number of items entering the system from left and
right is proportional to $\beta\lambda$ and $(1-\beta)L\mu_L$ respectively, so the second term needs to go to zero
appropriately fast, as $L \to \infty$, in order to obtain asymptotically well-defined dynamics.

First we identify, with the following result, the weak limits of the invariant distribution of
$M_L$, determined in Theorem 2.1, as $\mu_L$ goes to 0 with $L$.

Theorem 3.1. Let $\{\mu_L\}_{L \ge 1} \subset \mathbb{R}_+$ be decreasing and such that $L\mu_L \to 0$ as $L \to \infty$, and let
$(Z_1^{(L)}, \ldots, Z_L^{(L)})$ have distribution (7), with $\theta_i$ as in (8) replacing $\mu$ with $\mu_L$. Then, as $L \to \infty$,

    $(Z_1^{(L)}, Z_2^{(L)}, \ldots) \overset{d}{\to} (Z_1, Z_2, \ldots),$

where $(Z_1, Z_2, \ldots)$:

(i) are independent Poisson random variables with mean $\lambda/i$, if $\beta \in [1/2, 1]$;

(ii) have joint distribution (5), if $\beta \in (0, 1/2)$.

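Proof. Let $w_i(\beta)$ be as in (8). As $L \to \infty$, it is easy to see that, for any $i$, $w_i(\beta) \to (\beta/(1-\beta))^i$
when $\beta < 1/2$ and $w_i(\beta) \to 1$ when $\beta \ge 1/2$. Also, let $\theta_i^{(L)}$ be as in (8) with $\mu = \mu_L$. Since
$\mu_L \to 0$ as $L \to \infty$, $\theta_i^{(L)} \to (\beta/(1-\beta))^i \lambda$ when $\beta < 1/2$ and $\theta_i^{(L)} \to \lambda$ when $\beta \ge 1/2$. Then, by
using (27) of Lemma A.1 in the Appendix, as $L \to \infty$ we have

    $\sum_{i=1}^{L} \theta_i^{(L)} = \begin{cases} L\mu_L + \dfrac{\beta}{1-2\beta}(\lambda - \mu_L) + o(1), & \text{if } \beta < 1/2, \\[2mm] \lambda L + \dfrac{1-\beta}{1-2\beta}(\lambda - \mu_L) + o(1), & \text{if } \beta > 1/2. \end{cases}$

Thus $L\mu_L \to 0$ implies that $\sum_{i=1}^{L} \theta_i^{(L)}$ converges to $\lambda\beta/(1-2\beta)$ when $\beta < 1/2$, and diverges
for $\beta \ge 1/2$ (when $\beta = 1/2$, (27) gives $L(\lambda + \mu_L)/2$).

Denote now by $\tilde E_L$ and $E_\infty$ the expectations with respect to $\tilde\pi_L$ in (7) and $\pi_\infty$ in (9),
respectively, with parameters $\theta_i^{(L)}$ in $\pi_L$ and $\tilde\pi_L$, and $\theta_i = (\beta/(1-\beta))^i \lambda$ for $\beta < 1/2$ and $\theta_i = \lambda$
for $\beta \ge 1/2$ in $\pi_\infty$. For any sequence $\{\varphi_i\}$ such that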
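where the second and third equalities follow from

    $E(e^{-\varphi X}) = \exp\{-\theta(1 - e^{-\varphi})\},$

    $E(X e^{-\varphi X}) = -\frac{d}{d\varphi} E(e^{-\varphi X}) = \theta e^{-\varphi} \exp\{-\theta(1 - e^{-\varphi})\},$

for $X \sim \mathrm{Po}(x; \theta)$. Here

    $\lim_{L \to \infty} \tilde E_L\left(e^{-\sum_{i=1}^{L} \varphi_i Z_i}\right) = E_\infty\left(e^{-\sum_{i \ge 1} \varphi_i Z_i}\right)$

as long as

    $\lim_{L \to \infty} \sum_{i=1}^{L} \frac{\theta_i^{(L)}}{i}\left(1 - e^{-\varphi_i}\right) = \sum_{i \ge 1} \frac{\theta_i}{i}\left(1 - e^{-\varphi_i}\right).$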


The latter is in turn implied by $\theta_i^{(L)} \to \theta_i$ and an application of the monotone convergence
theorem, since $\theta_i^{(L-1)} < \theta_i^{(L)}$ for $L$ large when $\mu_L$ decreases to 0; see Lemma A.2 in the Appendix.

As for the first factor on the right hand side of (12), an application of Cesàro's theorem, together
with the fact that $\beta \ge 1/2$ and $\mu_L \to 0$, yields

    $\lim_{L \to \infty} \frac{\beta\lambda + (1-\beta)\mu_L + \sum_{i=1}^{L} \theta_i^{(L)} e^{-\varphi_i}}{\beta\lambda + (1-\beta)\mu_L + \sum_{i=1}^{L} \theta_i^{(L)}} = 1.$

When $\beta < 1/2$ and $\lim_{L \to \infty} L\mu_L = 0$,

(13)   $\lim_{L \to \infty} \frac{\beta\lambda + (1-\beta)\mu_L + \sum_{i=1}^{L} \theta_i^{(L)} e^{-\varphi_i}}{\beta\lambda + (1-\beta)\mu_L + \sum_{i=1}^{L} \theta_i^{(L)}} = \frac{\beta\lambda + \sum_{i \ge 1} \theta_i e^{-\varphi_i}}{\beta\lambda + \beta\lambda/(1-2\beta)},$

where at the numerator we have applied the monotone convergence theorem. Noting that

    $\frac{\beta\lambda + \sum_{i \ge 1} \theta_i e^{-\varphi_i}}{\beta\lambda + \beta\lambda/(1-2\beta)}\; E_\infty\left(e^{-\sum_{i \ge 1} \varphi_i Z_i}\right)$

corresponds to the Laplace transform of $m$ under the distribution (5) completes the proof. □


Note that, when $\beta \ge 1/2$, the weaker assumption that $\mu_L \to 0$ suffices for the above result.
This is informally due to the fact that $\beta \ge 1/2$ makes the addition of size-1 blocks to the system
frequent enough to counterbalance the frequency of $L$-sized blocks entering the system from the
right when $\mu_L \to 0$, instead of $L\mu_L \to 0$. When $\beta < 1/2$, this is not the case and $\mu_L$ must go to
zero faster than $1/L$.
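
The two regimes can be illustrated numerically through the behaviour of $\sum_{i=1}^{L} \theta_i^{(L)}$. The sketch below assumes the closed form of $\theta_i$ from Lemma A.1 and the hypothetical choice $\mu_L = 1/L^2$, which satisfies $L\mu_L \to 0$; the names `theta` and `theta_sum` are illustrative.

```python
def theta(i, L, beta, lam, mu):
    """theta_i for the chain with maximal allelic count L (closed form assumed)."""
    if abs(beta - 0.5) < 1e-12:
        w = (L - i + 1) / (L + 1)
    else:
        a, b = 1 - beta, beta
        w = b**i * (a**(L - i + 1) - b**(L - i + 1)) / (a**(L + 1) - b**(L + 1))
    return w * lam + (1 - w) * mu

def theta_sum(L, beta, lam, mu):
    return sum(theta(i, L, beta, lam, mu) for i in range(1, L + 1))

lam, L = 2.0, 300
mu_L = 1.0 / L**2                       # hypothetical sequence with L * mu_L -> 0

# beta < 1/2: the sum stabilises near lam * beta / (1 - 2 * beta).
sub = theta_sum(L, 0.3, lam, mu_L)
sub_limit = lam * 0.3 / (1 - 2 * 0.3)

# beta > 1/2: the sum grows linearly, roughly like lam * L.
sup = theta_sum(L, 0.7, lam, mu_L)
```

Comparing `sub` with `sub_limit`, and `sup / L` with `lam`, shows the finite limit below $1/2$ and the linear divergence above it.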

Next, we exploit an embedding of $M_L$ in $M$ at appropriate stopping times, in order to show
that the weak limits in Theorem 3.1 describe the long-time behaviour of the B&D-PU. Note
that, when $\beta \ge 1/2$, the result extends Theorem 1 in Arratia, Barbour and Tavaré (1992).

Theorem 3.2. Let $M$ be a B&D-PU with transitions (3), and let $Z_j(h)$ be its $j$th component.
Then, as $h \to \infty$,

    $(Z_1(h), Z_2(h), \ldots) \overset{d}{\to} (Z_1, Z_2, \ldots),$

where $(Z_1, Z_2, \ldots)$:

(i) have joint distribution (5), for $\beta \in (0, 1/2)$;

(ii) are independent Poisson random variables with mean $\lambda/i$, for $\beta \in [1/2, 1]$.

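Proof. Let $\rho_L : \mathbb{Z}_+^\infty \to \mathbb{Z}_+^L$, defined as

(14)   $\rho_L(m) = (m_1, \ldots, m_L),$

be the restriction of $m \in \mathbb{Z}_+^\infty$ to its first $L$ components. We show that, for any $N \in \mathbb{N}$,
$\rho_N(M(h)) \overset{d}{\to} (Z_1, \ldots, Z_N)$ as $h \to \infty$ in the two regimes. To this end, define the auxiliary
chains $\tilde M_L = \{\tilde M_L(h), h \in \mathbb{Z}_+\}$ on $\mathbb{Z}_+^\infty$ with transition probabilities

(15)   $\tilde p_L(m' \mid m) \propto \begin{cases} \beta\lambda, & m' = m + e_1, \\ \beta\, i m_i, & m' = m - e_i + e_{i+1},\ i \ge 1, \\ (1-\beta)\, i m_i, & m' = m - e_i + e_{i-1},\ i \ne L+1, \\ (1-\beta)\mu_L, & m' = m - e_{L+1} + e_L, \end{cases}$

and normalising constant $\beta\lambda + (1-\beta)\mu_L + |m| - (1-\beta)(L+1)m_{L+1}$. Here $\tilde M_L$ differs from a
B&D-PU in that transitions $m \mapsto m - e_{L+1} + e_L$, whereby one item from an $(L+1)$-sized group
is removed, have probability proportional to $(1-\beta)\mu_L$ instead of $(1-\beta)(L+1)m_{L+1}$. Note
that the counts $m_L$ and $m_{L+1}$ remain dependent, in view of the transition $m \mapsto m - e_L + e_{L+1}$. Let
$\{\mu_L\}_{L \ge 1}$ be a decreasing sequence such that, as $L \to \infty$, $L\mu_L \to 0$. Let

(16)   $\sigma_k = \min\{h > \sigma_{k-1} : \rho_L(\tilde M_L(h)) \ne \rho_L(\tilde M_L(\sigma_{k-1}))\}$

be the $k$th time a transition of $\tilde M_L$ involves the first $L$ counts. Proposition A.3 in the Appendix
shows that $M_L$ is embedded in $\tilde M_L$ at the stopping times $\sigma_k$, in that $\{\sigma_k, k \ge 1\}$ occur infinitely
often and

    $P\left(\rho_L(\tilde M_L(\sigma_k)) = m' \mid \rho_L(\tilde M_L(\sigma_{k-1})) = m\right) = p_L(m' \mid m),$

with $p_L$ as in (6) with $\mu = \mu_L$. Together with Theorem 2.1, this implies

    $\rho_L\left(\tilde M_L(\sigma_k)\right) \overset{d}{\to} (Z_1^{(L)}, \ldots, Z_L^{(L)}) \quad \text{as } k \to \infty,$

where the right hand side has distribution $\pi_L$ with $\mu = \mu_L$, see (7). Clearly, this also implies

(17)   $\rho_N\left(\tilde M_L(\sigma_k)\right) \overset{d}{\to} (Z_1^{(L)}, \ldots, Z_N^{(L)}) \quad \text{as } k \to \infty$

for any $N < L$. Emphasising the dependence on $L$ in (16) by $\sigma_k^{(L)}$, note now that
$\{\sigma_k^{(L)}, k \ge 1\} \subset \{\sigma_k^{(L+1)}, k \ge 1\}$ and $\{\sigma_k^{(L)}, k \ge 1\} \uparrow \mathbb{N}$ as $L \to \infty$ with probability one, since for all $h \in \mathbb{N}$
there exists an $L_0$ such that $\{1, \ldots, h\} \subset \{\sigma_k^{(L)}, k \ge 1\}$ for all $L > L_0$. Therefore, for any given
$h$,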


Pn (^ML{h)j 


Pn 



)) 


for $L$ sufficiently large. The result now follows by taking the limit for $L \to \infty$ on both sides of
(17), in virtue of Theorem 3.1. □


The asymptotic regimes of B&D-PUs are thus determined by the probability of introducing
new singleton blocks into the system. These produce for $\beta \ge 1/2$ infinite partitions analogous to
those induced by Pólya urns, since insertions of singletons are frequent enough to make deletions
asymptotically irrelevant, and the counts $m_i$ become independent in the limit. When $\beta < 1/2$,
instead, the stream of incoming items is not frequent enough and the dependence between the
counts $m_i$ is retained in the limit with distribution (5). The next result shows that this latter
case provides the reversible and invariant distribution of B&D-PUs with $\beta \in (0, 1/2)$.


Theorem 3.3. Let $M$ have transitions as in (3) with $\beta \in (0, 1/2)$. Then (5) is the reversible
and invariant measure of $M$.

Proof. Let $\pi_\infty(m)$ be as in (9) and $C^{-1} = \beta\lambda + \beta\lambda/(1-2\beta)$ be the normalizing constant
appearing in (5). Then, for any $m \in \mathbb{Z}_+^\infty$, we have

    $\pi(m + e_1)\, p(m \mid m + e_1) = C\pi_\infty(m + e_1)(1-\beta)(m_1 + 1)$
    $\qquad = C\pi_\infty(m)\, \frac{\theta_1}{m_1 + 1}\, (1-\beta)(m_1 + 1) = C\pi_\infty(m)\beta\lambda = \pi(m)\, p(m + e_1 \mid m)$

and for any $i \ge 1$

    $\pi(m - e_i + e_{i+1})\, p(m \mid m - e_i + e_{i+1}) = C\pi_\infty(m - e_i + e_{i+1})(1-\beta)(i+1)(m_{i+1} + 1)$
    $\qquad = C\pi_\infty(m)\, \frac{i m_i}{\theta_i}\, \frac{\theta_{i+1}}{(i+1)(m_{i+1} + 1)}\, (1-\beta)(i+1)(m_{i+1} + 1)$
    $\qquad = C\pi_\infty(m)\beta\, i m_i = \pi(m)\, p(m - e_i + e_{i+1} \mid m),$

yielding the result. Finally, in view of the positive recurrence of the chain, which can be easily
proved, $\pi(m)$ is also the unique invariant measure of $M$. □
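
The detailed balance computation can be reproduced numerically. The sketch below assumes that the invariant measure is proportional to $(\beta\lambda + \sum_i i m_i) \prod_i (\theta_i/i)^{m_i}/m_i!$ with $\theta_i = (\beta/(1-\beta))^i \lambda$, dropping the constant factor common to all states, and compares the probability flows in both directions across a birth and across a growth transition. The function names and the test state are hypothetical.

```python
from math import factorial, isclose

def size(m):
    return sum(i * mi for i, mi in enumerate(m, start=1))

def weight(m, beta, lam):
    """Invariant measure up to a constant; m = (m_1, ..., m_d), zero beyond d."""
    rho = beta / (1 - beta)
    w = beta * lam + size(m)
    for i, mi in enumerate(m, start=1):
        w *= (rho**i * lam / i) ** mi / factorial(mi)
    return w

def birth_flows(m, beta, lam):
    """Flow m -> m + e_1 and the reverse flow m + e_1 -> m."""
    up = list(m)
    up[0] += 1
    n = size(m)
    forward = weight(m, beta, lam) * beta * lam / (beta * lam + n)
    backward = weight(up, beta, lam) * (1 - beta) * up[0] / (beta * lam + n + 1)
    return forward, backward

def growth_flows(m, i, beta, lam):
    """Flow m -> m - e_i + e_{i+1} and its reverse; needs m_i > 0 and i < len(m)."""
    up = list(m)
    up[i - 1] -= 1
    up[i] += 1
    n = size(m)
    forward = weight(m, beta, lam) * beta * i * m[i - 1] / (beta * lam + n)
    backward = weight(up, beta, lam) * (1 - beta) * (i + 1) * up[i] / (beta * lam + n + 1)
    return forward, backward

state = (2, 1, 0, 3, 0)
pairs = [birth_flows(state, 0.3, 1.7)] + [growth_flows(state, i, 0.3, 1.7) for i in (1, 2, 4)]
```

Each pair in `pairs` contains the forward and backward flow across one transition, and the two entries agree up to rounding when reversibility holds.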
With an argument similar to that used in Section 2 for the chains with maximal allelic count, the
invariant distribution of B&D-PUs admits a representation as an augmented vector of independent
Poisson variables. Recalling that $\theta_i = (\beta/(1-\beta))^i \lambda$, define $J = j$ with probability proportional
to $\theta_j$ for $j \ge 1$ and proportional to $\beta\lambda$ for $j = 0$. One can easily check, using the geometric
series $\sum_{i \ge 1} \rho^i = \rho/(1-\rho)$ for $0 < \rho < 1$, that

(18)   $P(J = 0) = \frac{1}{2}\left(1 - \frac{\beta}{1-\beta}\right), \qquad P(J = j) = \frac{\theta_j}{2\beta\lambda}\left(1 - \frac{\beta}{1-\beta}\right), \quad j \ge 1.$

This can be obtained from a geometric distribution of parameter $1 - \beta/(1-\beta)$, by reallocating
half of the mass assigned to 0 to the other support points, resulting in a tilting given by the
factor $1/(2\beta)$. Next, if $m = (m_1, m_2, \ldots)$ where the $m_i$'s are independent each with $\mathrm{Po}(m_i; \theta_i/i)$
distribution, then

(19)   $m + e_J \sim \pi(m)$

for $\pi(m)$ in (5). Note also that in the stationary regime, the random partition induced by B&D-PUs
has a random number of groups $K := \sum_{i \ge 1} m_i$, contrary to the number of groups induced
by usual Pólya urns, recovered here for $\beta \ge 1/2$, which grows to infinity asymptotically as $\log h$
(Korwar and Hollander, 1973). Exploiting the representation (18)-(19) and using the fact that
$\sum_{i \ge 1} \rho^i/i = -\log(1-\rho)$ for $0 < \rho < 1$, one finds that, when $\beta < 1/2$,

(20)   $K = K_\infty + \mathbb{1}(J \ge 1), \qquad K_\infty \sim \mathrm{Pois}\big(k; -\lambda\log(1 - \beta/(1-\beta))\big),$

where $K_\infty$ corresponds to the sum of the independent Poisson random variables with parameters
$\theta_i/i$ on the l.h.s. of (19). This immediately yields the moments of $K$, for example

    $E(K) = P(J \ge 1) - \lambda\log(1 - \beta/(1-\beta)).$
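
The law of $J$ and the expected number of groups can be evaluated numerically. The sketch below assumes $\theta_j = (\beta/(1-\beta))^j \lambda$, weights $\beta\lambda$ for $j = 0$ and $\theta_j$ for $j \ge 1$, and truncates the support of $J$ at a hypothetical cutoff `jmax`; the function names are illustrative.

```python
from math import log

def j_distribution(beta, lam, jmax):
    """P(J = j) for j = 0, ..., jmax, normalising weights beta*lam and theta_j."""
    rho = beta / (1 - beta)
    weights = [beta * lam] + [rho**j * lam for j in range(1, jmax + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def expected_blocks(beta, lam):
    """E(K) = P(J >= 1) - lam * log(1 - rho), valid for beta < 1/2."""
    rho = beta / (1 - beta)
    p_j_positive = 1 - 0.5 * (1 - rho)
    return p_j_positive - lam * log(1 - rho)

beta, lam = 0.3, 2.0
rho = beta / (1 - beta)
probs = j_distribution(beta, lam, jmax=200)

# Direct evaluation of E(K): sum of Poisson means theta_i / i plus P(J >= 1).
direct = sum(rho**i * lam / i for i in range(1, 201)) + (1 - probs[0])
closed = expected_blocks(beta, lam)
```

With $\rho = \beta/(1-\beta) < 1$ the truncation error at `jmax = 200` is negligible, so `probs[0]` can be compared with $(1-\rho)/2$ and `direct` with `closed`.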


4 Connection with a mixture of Ewens sampling formulas 

We conclude by showing that the invariant measure of B&D-PUs corresponds to a mixture of
Ewens sampling formulas $\mathrm{ESF}_n$ in (2) with a tilted Negative Binomial mixing measure on the
sample size $n$. Let

(21)   $\mathrm{NB}(n; r, p) = \frac{\Gamma(r + n)}{n!\, \Gamma(r)}\, p^n (1-p)^r, \qquad n = 0, 1, \ldots$

be the Negative Binomial distribution with parameters $r > 0$ and $p \in (0, 1)$.

Theorem 4.1. Let $\beta \in (0, 1/2)$ and $M$ be a B&D-PU with transition probabilities (3) and
invariant distribution $\pi$ as in (5). Then

(i) $N := \sum_{i \ge 1} i m_i$ is a birth-and-death chain with invariant distribution

    $\mu(n) \propto (\beta\lambda + n)\, \mathrm{NB}(n; \lambda, \beta/(1-\beta)), \qquad n = 0, 1, \ldots;$

(ii) $\pi$ admits representation as a mixture of Ewens sampling formulas

(22)   $\pi(m) = \sum_{n \ge 0} \mathrm{ESF}_n(\rho_n(m))\, \mu(n), \qquad m \in \mathbb{Z}_+^\infty,$

with $\rho_n$ as in (14).

Note that (22) is well defined for $n = 0$ if one interprets (2) as giving probability one to the
empty vector $\rho_0(m) = \emptyset$, as one can easily verify that $\pi(0, 0, \ldots) = \mu(0)$.


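Proof. From (3), it is easily seen that $N$ is a birth-and-death chain with immigration, whose
transitions $n \to n \pm 1$ have probabilities $t(n \pm 1 \mid n)$ proportional to $\beta(\lambda + n)$ and $(1-\beta)n$
respectively. The expected increment of $N$ is proportional to $\beta\lambda - (1-2\beta)n$, which is always
positive for $\beta \ge 1/2$, yielding nonstationarity. For $\beta < 1/2$, using the fact that the expected
value of the Negative Binomial distribution in (21) is $pr/(1-p)$, it is readily verified that

(23)   $\mu(n) = \frac{\beta\lambda + n}{\beta\lambda + \beta\lambda/(1-2\beta)}\, \mathrm{NB}(n; \lambda, \beta/(1-\beta)) = \frac{\beta\lambda + n}{2\beta\lambda}\, \frac{\lambda_{(n)}}{n!} \left(\frac{\beta}{1-\beta}\right)^{n} \left(1 - \frac{\beta}{1-\beta}\right)^{\lambda+1},$

where $a_{(x)} = a(a+1)\cdots(a+x-1) = \Gamma(a+x)/\Gamma(a)$ is the Pochhammer symbol. The detailed
balance condition for $N$ then reads

(24)   $\mu(n-1)\, t(n \mid n-1) = \frac{\beta\lambda + n - 1}{2\beta\lambda}\, \frac{\lambda_{(n-1)}}{(n-1)!} \left(\frac{\beta}{1-\beta}\right)^{n-1} \left(1 - \frac{\beta}{1-\beta}\right)^{\lambda+1} \frac{\beta(\lambda + n - 1)}{\beta\lambda + n - 1}$
       $\qquad = \frac{\beta\lambda + n}{2\beta\lambda}\, \frac{\lambda_{(n)}}{n!} \left(\frac{\beta}{1-\beta}\right)^{n} \left(1 - \frac{\beta}{1-\beta}\right)^{\lambda+1} \frac{(1-\beta)n}{\beta\lambda + n}$
       $\qquad = \mu(n)\, t(n-1 \mid n),$

hence $\mu$ is the reversible measure for $N$, yielding the first assertion. To prove (22), it suffices to
show that $\pi(m) = \mathrm{ESF}_n(\rho_n(m))\mu(n)$ whenever $m \in \mathbb{Z}_+^\infty$ is such that $\sum_{i \ge 1} i m_i = n$, for $\mathrm{ESF}_n$
and $\mu(n)$ as in (2) and (23). Let $p = \beta/(1-\beta)$ and $C^{-1} = \beta\lambda + \beta\lambda/(1-2\beta) = 2\beta\lambda/(1 - \beta/(1-\beta))$
be the normalising constant appearing both in (5) and in the first display of (23). Assuming
that $\sum_{i \ge 1} i m_i = n$, we have

    $\pi(m) = C\Big(\beta\lambda + \sum_{i \ge 1} i m_i\Big) \prod_{i \ge 1} \frac{1}{m_i!} \left(\frac{p^i \lambda}{i}\right)^{m_i} e^{-p^i\lambda/i}$
    $\qquad = C(\beta\lambda + n)\, p^n \exp\Big\{-\lambda \sum_{i \ge 1} \frac{p^i}{i}\Big\} \prod_{i=1}^{n} \left(\frac{\lambda}{i}\right)^{m_i} \frac{1}{m_i!}$
    $\qquad = C(\beta\lambda + n) \left(\frac{\beta}{1-\beta}\right)^{n} \left(1 - \frac{\beta}{1-\beta}\right)^{\lambda} \prod_{i=1}^{n} \left(\frac{\lambda}{i}\right)^{m_i} \frac{1}{m_i!},$

where in the third equality we have used the fact that $\sum_{i \ge 1} p^i/i = -\log(1-p)$. Multiplying
and dividing by $\lambda_{(n)}/n!$ now gives the desired condition. □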
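
The statements of Theorem 4.1 lend themselves to a numerical check. The sketch below assumes the forms $\mu(n) = \frac{\beta\lambda + n}{2\beta\lambda} \frac{\lambda_{(n)}}{n!} \rho^n (1-\rho)^{\lambda+1}$ with $\rho = \beta/(1-\beta)$, the Ewens sampling formula $n!/\lambda_{(n)} \prod_i (\lambda/i)^{m_i}/m_i!$, and the tilted product-Poisson form of $\pi$; function names such as `nb_ratio` and `mixing` are hypothetical.

```python
from math import factorial, isclose

def nb_ratio(lam, n):
    """lam_(n) / n!, computed as a running product to avoid overflow."""
    out = 1.0
    for k in range(n):
        out *= (lam + k) / (k + 1)
    return out

def mixing(n, beta, lam):
    """Tilted Negative Binomial mass mu(n) on the sample size."""
    rho = beta / (1 - beta)
    return (beta * lam + n) / (2 * beta * lam) * nb_ratio(lam, n) * rho**n * (1 - rho) ** (lam + 1)

def esf(m, lam):
    """Ewens sampling formula at m, with n = sum_i i * m_i."""
    n = sum(i * mi for i, mi in enumerate(m, start=1))
    out = 1.0 / nb_ratio(lam, n)                      # n! / lam_(n)
    for i, mi in enumerate(m, start=1):
        out *= (lam / i) ** mi / factorial(mi)
    return out

def invariant(m, beta, lam):
    """pi(m) for m = (m_1, ..., m_d), zero beyond d, assuming beta < 1/2."""
    rho = beta / (1 - beta)
    n = sum(i * mi for i, mi in enumerate(m, start=1))
    # exp(-sum_i theta_i / i) = (1 - rho)^lam accounts for all Poisson zero-factors.
    out = (beta * lam + n) / (beta * lam + beta * lam / (1 - 2 * beta)) * (1 - rho) ** lam
    for i, mi in enumerate(m, start=1):
        out *= (rho**i * lam / i) ** mi / factorial(mi)
    return out

beta, lam = 0.3, 1.7
total_mass = sum(mixing(n, beta, lam) for n in range(300))
m = (2, 1, 0, 1)                                       # sample size n = 8
lhs, rhs = invariant(m, beta, lam), esf(m, lam) * mixing(8, beta, lam)
```

Here `total_mass` should be close to one, and `lhs` and `rhs` compare the two sides of the mixture representation at a single state.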


As for the sampling process associated with B&D-PUs, say $X$, which alternates
sampling of observations from (1) with removals of uniformly chosen observations, this evolves in
$\mathcal{X} = \emptyset \cup (\cup_{n \ge 1} \mathbb{X}^n)$ according to the following transition probabilities

(25)   $q(x' \mid x) \propto \begin{cases} \beta\lambda, & \text{if } x' = (x_1, \ldots, x_N, y),\ y \sim P_0, \\ \beta, & \text{if } x' = (x_1, \ldots, x_j, \ldots, x_N, x_j),\ 1 \le j \le N, \\ (1-\beta), & \text{if } x' = (x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_N),\ 1 \le j \le N, \end{cases}$

with normalising constant $\beta\lambda + N$. It is immediate from Theorem 4.1 to see that this particle
process is also stationary when $\beta < 1/2$ with invariant measure given by the mixture of Pólya
urn schemes

    $\mathrm{PU}(dx) := \sum_{n \ge 0} \mathrm{PU}_n(dx)\, \mu(n),$

where $\mu$ is as in Theorem 4.1, $\mathrm{PU}_n(dx)$ represents the joint law of $X_1, \ldots, X_n$ drawn from the
Pólya urn scheme (1), conditional on $N = n$, and $\mathrm{PU}_0$ assigns probability one to the empty set.
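Appendix


Lemma A.1. Let $\lambda, \mu > 0$. Then the system

    $\theta_1 - (1-\beta)\theta_2 = \beta\lambda,$
    $\theta_i - \beta\theta_{i-1} - (1-\beta)\theta_{i+1} = 0, \quad i = 2, \ldots, L-1,$
    $\theta_L - \beta\theta_{L-1} = (1-\beta)\mu,$

has solution

(26)   $\theta_i = \begin{cases} \dfrac{\beta^{i}\left[(1-\beta)^{L-i+1} - \beta^{L-i+1}\right]}{(1-\beta)^{L+1} - \beta^{L+1}}\, \lambda + \dfrac{(1-\beta)^{L-i+1}\left[(1-\beta)^{i} - \beta^{i}\right]}{(1-\beta)^{L+1} - \beta^{L+1}}\, \mu, & \beta \ne 1/2, \\[3mm] \dfrac{L-i+1}{L+1}\, \lambda + \dfrac{i}{L+1}\, \mu, & \beta = 1/2. \end{cases}$

Moreover

(27)   $\sum_{i=1}^{L} \theta_i = \begin{cases} \dfrac{(1-\beta)^{L+1}\mu - \beta^{L+1}\lambda}{(1-\beta)^{L+1} - \beta^{L+1}}\, L + \dfrac{(1-\beta)\beta}{1-2\beta}\, \dfrac{(1-\beta)^{L} - \beta^{L}}{(1-\beta)^{L+1} - \beta^{L+1}}\, (\lambda - \mu), & \beta \ne 1/2, \\[3mm] \dfrac{\lambda + \mu}{2}\, L, & \beta = 1/2. \end{cases}$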


Proof. The system can be written $A\theta = b$, where $b = \beta\lambda e_1 + (1-\beta)\mu e_L$ and $A$ is a tridiagonal
matrix with entries $-(1-\beta)$, $1$, $-\beta$ respectively above, on and below the main diagonal. Let
$d_L = \det(A)$ where the subscript $L$ refers to the dimension of $A$. Then $d_l$, $1 \le l \le L$, corresponds
to the determinant of the $l \times l$ submatrix made of the first $l$ rows and columns of $A$, i.e. $\det(A)$
when $L = l$. By using expansion by the first column, it is easy to check that $d_l$ satisfies the
recursive relation

(28)   $d_l = d_{l-1} - \beta(1-\beta)d_{l-2}, \qquad l > 2,$

with initial values $d_1 = 1$, $d_2 = 1 - \beta(1-\beta)$. Consider the associated second order difference
equation

(29)   $x_{t+2} - x_{t+1} + \beta(1-\beta)x_t = 0, \qquad t \ge 0,$

i.e. (28) with $d_t = x_{t-1}$. It has characteristic equation $m^2 - m + \beta(1-\beta) = 0$ with roots $m_1 = \beta$
and $m_2 = (1-\beta)$. When $\beta \ne 1/2$, (29) has general solution given by

    $x_t = c_1\beta^t + c_2(1-\beta)^t, \qquad c_1, c_2 \in \mathbb{R},$

so that by using the initial conditions $x_0 = 1$ and $x_1 = 1 - \beta(1-\beta)$, we find
$c_2 = (1-\beta)^2/(1-2\beta)$, $c_1 = 1 - c_2 = -\beta^2/(1-2\beta)$ and, in turn,

(30)   $d_l = \left((1-\beta)^{l+1} - \beta^{l+1}\right)/(1-2\beta).$

When $\beta = 1/2$, (29) has general solution given by $x_t = (c_1 + c_2 t)(1/2)^t$, $c_1, c_2 \in \mathbb{R}$, where $1/2$
is the common root of the characteristic equation. By using the initial conditions $x_0 = 1$ and
$x_1 = 1 - \beta(1-\beta) = 1 - 1/4$, we find $c_1 = 1$, $c_2 = 1/2$, therefore

(31)   $d_l = \left(1 + \tfrac{l-1}{2}\right) 2^{-(l-1)} = (l+1)2^{-l}.$

Note that $d_l \ne 0$ for any $l$ since the constant solution of (29), $x_t = 0$, is ruled out by the initial
condition $x_0 = 1$. Moreover, $d_l > \beta(1-\beta)d_{l-1}$, so that $d_l > 0$ for any $l$, cf. (28). In particular,
$d_l > \beta^{l-1}(1-\beta)^{l-1}$, $l \ge 2$. Since for $\beta \ne 0, 1$ the absolute value of the roots of the characteristic
equation is $< 1$, the constant solution $x_t = 0$ is stable, meaning that $x_t \to 0$ as $t \to \infty$, i.e.
$\lim_{l \to \infty} d_l = 0$. When $\beta = 0, 1$, (29) has constant solution $x_t = 1$, hence $d_l = 1$ for any $l$.
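
The determinant recursion and its closed-form solutions can be compared numerically. The sketch below assumes the recursion $d_l = d_{l-1} - \beta(1-\beta)d_{l-2}$ with $d_0 = d_1 = 1$ and the closed forms $d_l = ((1-\beta)^{l+1} - \beta^{l+1})/(1-2\beta)$ for $\beta \ne 1/2$ and $d_l = (l+1)2^{-l}$ for $\beta = 1/2$; the function names are illustrative.

```python
def d_recursive(l_max, beta):
    """d_0, ..., d_{l_max} from the three-term recursion, with d_0 = d_1 = 1."""
    d = [1.0, 1.0]
    for l in range(2, l_max + 1):
        d.append(d[l - 1] - beta * (1 - beta) * d[l - 2])
    return d

def d_closed(l, beta):
    """Closed-form determinant of the l x l tridiagonal matrix."""
    if abs(beta - 0.5) < 1e-12:
        return (l + 1) * 2.0 ** (-l)
    return ((1 - beta) ** (l + 1) - beta ** (l + 1)) / (1 - 2 * beta)

# Largest discrepancy between the recursion and the closed form over a small grid.
gap = max(
    abs(dr - d_closed(l, beta))
    for beta in (0.2, 0.5, 0.9)
    for l, dr in enumerate(d_recursive(20, beta))
)
```

The same comparison also confirms that the determinants stay positive and decay to zero for $\beta \in (0, 1)$.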

Since now $d_L \ne 0$, the solution is unique and given by

    $\theta = A^{-1}b = \beta A^{-1}e_1\lambda + (1-\beta)A^{-1}e_L\mu$

and, in particular,

(32)   $\theta_i = \beta a_{i1}\lambda + (1-\beta)a_{iL}\mu, \qquad i = 1, \ldots, L,$

where $a_{ij}$ is the $(i,j)$th entry of $A^{-1}$. By using Cramer's method,

    $a_{ij} = d_L^{-1}(-1)^{i+j}\det A_{ji},$

where $A_{ij}$ is the $(L-1) \times (L-1)$ matrix obtained by deletion of the $i$th row and $j$th column
of $A$. Hence,

    $a_{i1} = d_L^{-1}(-1)^{1+i}\det A_{1i}, \qquad a_{iL} = d_L^{-1}(-1)^{L+i}\det A_{Li}.$

Consider $a_{iL}$ first. It is easy to see that $A_{L1}$ is lower triangular with $-(1-\beta)$ in the diagonal,
hence $\det(A_{L1}) = [-(1-\beta)]^{L-1}$. For $1 < i < L$, $A_{Li}$ is an $(L-1) \times (L-1)$ matrix that can be
partitioned as

    $A_{Li} = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix},$

where $B_{11}$ corresponds to $A$ with $i-1$ rows, $B_{12}$ is an $(i-1) \times (L-i)$ matrix made of all zeros and
$B_{22}$ is a lower triangular matrix with $L-i$ rows and $-(1-\beta)$ in the diagonal. In particular,
$\det(B_{11}) = d_{i-1}$ and $\det(B_{22}) = [-(1-\beta)]^{L-i}$. By using the formula for the determinant of
partitioned matrices,

    $\det\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \det(B_{11})\det\left(B_{22} - B_{21}B_{11}^{-1}B_{12}\right),$

we find that

    $\det A_{Li} = \det(B_{11})\det(B_{22}) = d_{i-1}(-1)^{L-i}(1-\beta)^{L-i},$

so that

    $a_{iL} = d_L^{-1}(-1)^{L+i}d_{i-1}(-1)^{L-i}(1-\beta)^{L-i} = d_L^{-1}(-1)^{2L}d_{i-1}(1-\beta)^{L-i} = d_L^{-1}d_{i-1}(1-\beta)^{L-i}.$

Finally, $A_{LL}$ corresponds to $A$ with $L-1$ rows, hence $\det(A_{LL}) = d_{L-1}$. Summing up, by using
the convention $d_0 = 1$,

(33)   $a_{iL} = d_L^{-1}(1-\beta)^{L-i}d_{i-1}, \qquad i = 1, \ldots, L.$

As for $a_{i1}$, we can exploit a certain symmetry of $A$: writing $A = A_\beta$, reversing the order of both
rows and columns of $A_\beta$ yields $A_{1-\beta}$, so that the $(i,1)$th entry of $A_\beta^{-1}$ equals the $(L-i+1, L)$th
entry of $A_{1-\beta}^{-1}$, to find that

(34)   $a_{i1} = d_L^{-1}\beta^{i-1}d_{L-i}, \qquad i = 1, \ldots, L.$

Plugging (33) and (34) into (32), one obtains

    $\theta_i = \frac{\beta^{i}d_{L-i}}{d_L}\, \lambda + \frac{(1-\beta)^{L-i+1}d_{i-1}}{d_L}\, \mu$

and the claim follows by using (30) and (31). By direct calculation, one can check that $\theta_i$ is
a convex linear combination of $\lambda$ and $\mu$. Alternatively, one can use directly equation (32). Let
$\mathbf{1} = e_1 + \ldots + e_L = (1, \ldots, 1)$. Since $A\mathbf{1} = (\beta, 0, \ldots, 0, (1-\beta))$, we have
$\beta A^{-1}e_1 + (1-\beta)A^{-1}e_L = \mathbf{1}$, that is $\beta a_{i1} + (1-\beta)a_{iL} = 1$ for any $1 \le i \le L$. Finally,
$\beta a_{i1}, (1-\beta)a_{iL} > 0$ as a simple calculation reveals. This completes the proof.

As for (27), the result for /3 = 1/2 is straightforward. When /3 7 ^ 1/2, it is convenient to write 


ey = 


(1-/3)^+'(t^)'-/3^+\ (1-/3)^+' 

■A + 


(1 - /3)^+i - /3^+i (1 - /3)^+i - /3^+i 


1 - 


/3 


1-/3 




We have 


1 


L 

'S^aW = _ 

^ * (1 - /3)^+i - /3-^+i 


(1 - /3)^+i ^ 


/3 


fervi-/3 


-L/3 


L+l 
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+ 


L(1 _/3)i+i _ (1 _ ;3)^+i ^ _ 


2=1 


/3 


-/3 




(1 -/3)^+V-;5^+^A 

(1 - /3)^+i - /3^+i ^ ' (1-/3) 


L + 


(1-/3) 


L+l 


_ «R+l _ 




rE T 


2=1 


/3 


-/3 


(A-/i) 


The result follows by the formula for the sum of the first $L$ terms of a geometric series,

$$\sum_{i=1}^{L}\left(\frac{\beta}{1-\beta}\right)^{i} = \frac{\beta}{1-2\beta}\,\frac{(1-\beta)^{L}-\beta^{L}}{(1-\beta)^{L}}.$$

The simplification (11) can be derived from (26) by using $x^2 - y^2 = (x-y)(x+y)$. □
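
As a sanity check of the averaging step, the following sketch compares the direct average of $\theta_i$ with the expression obtained after inserting the geometric sum. It assumes the form of $\theta_i$ written in this proof for $\beta \neq 1/2$; function names and test values are illustrative, not part of the original argument.

```python
from fractions import Fraction

def theta(i, L, beta, lam, mu):
    """theta_i for beta != 1/2, written as a combination of lam and mu."""
    rho = beta / (1 - beta)
    D = (1 - beta) ** (L + 1) - beta ** (L + 1)
    w_lam = ((1 - beta) ** (L + 1) * rho ** i - beta ** (L + 1)) / D
    w_mu = (1 - beta) ** (L + 1) * (1 - rho ** i) / D
    return w_lam * lam + w_mu * mu

def geometric_sum(L, beta):
    """Closed form of sum_{i=1}^L (beta/(1-beta))^i for beta != 1/2."""
    return beta / (1 - 2 * beta) * ((1 - beta) ** L - beta ** L) / (1 - beta) ** L

def average_direct(L, beta, lam, mu):
    """Average of theta_1, ..., theta_L computed term by term."""
    return sum(theta(i, L, beta, lam, mu) for i in range(1, L + 1)) / L

def average_closed(L, beta, lam, mu):
    """Average of theta_i after substituting the geometric sum."""
    D = (1 - beta) ** (L + 1) - beta ** (L + 1)
    S = geometric_sum(L, beta)
    return (((1 - beta) ** (L + 1) * mu - beta ** (L + 1) * lam) / D
            + (1 - beta) ** (L + 1) * S * (lam - mu) / (L * D))
```

With rational inputs such as `beta = Fraction(1, 3)`, `lam = Fraction(2)`, `mu = Fraction(1, 5)`, the two averages agree exactly, which is a convenient way to catch sign or index slips in the displayed formulas.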

Lemma A.2. Let $\theta_i^{(L)}$ be defined as in (26) with $\mu = \mu_L$ and $\{\mu_L\}_{L\ge 1}$ a decreasing sequence of positive real numbers such that $\mu_L \to 0$ as $L \to \infty$. Then $\theta_i^{(L)} > \theta_i^{(L-1)}$ for $L$ large enough and any $i = 1,\dots,L$.


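Proof. Consider the case of $\beta \neq 1/2$ and, as a shorthand notation, let $\rho = \beta/(1-\beta)$ and $c = [(1-\beta)^{L+1}-\beta^{L+1}][(1-\beta)^{L}-\beta^{L}]$. We have

$$\begin{aligned}
\theta_i^{(L)} - \theta_i^{(L-1)}
&= \left\{\frac{(1-\beta)^{L+1}\rho^{i}-\beta^{L+1}}{(1-\beta)^{L+1}-\beta^{L+1}} - \frac{(1-\beta)^{L}\rho^{i}-\beta^{L}}{(1-\beta)^{L}-\beta^{L}}\right\}\lambda \\
&\quad + \left\{\frac{(1-\beta)^{L+1}\mu_L}{(1-\beta)^{L+1}-\beta^{L+1}} - \frac{(1-\beta)^{L}\mu_{L-1}}{(1-\beta)^{L}-\beta^{L}}\right\}(1-\rho^{i}) \\
&= \frac{(1-\beta)^{L}\beta^{L}}{c}\left\{-(1-\beta)\rho^{i} - \beta + 1 - \beta + \beta\rho^{i}\right\}\lambda \\
&\quad + \frac{1}{c}\left\{(1-\beta)^{2L+1}(\mu_L-\mu_{L-1}) - (1-\beta)^{L}\beta^{L}\left[(1-\beta)\mu_L - \beta\mu_{L-1}\right]\right\}(1-\rho^{i}) \\
&= \frac{(1-\beta)^{L}\beta^{L}}{c}(1-2\beta)(1-\rho^{i})\lambda + \frac{1-\rho^{i}}{c}\left\{(1-\beta)^{2L+1}(\mu_L-\mu_{L-1}) - (1-\beta)^{L}\beta^{L}(1-2\beta)\left[\mu_L + \frac{\beta}{1-2\beta}(\mu_L-\mu_{L-1})\right]\right\} \\
&= \frac{(1-\beta)^{L}\beta^{L}(1-\rho^{i})}{c}\left\{(1-2\beta)(\lambda-\mu_L) + \left[(1-\beta)\rho^{-L}-\beta\right](\mu_L-\mu_{L-1})\right\}.
\end{aligned}$$
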


Note that $c > 0$ for any $\beta \neq 1/2$. When $\beta < 1/2$, $1-\rho^{i} > 0$, $(1-2\beta) > 0$ and $\lambda > \mu_L$ for $L$ large since $\mu_L \to 0$. As for the second term $[(1-\beta)\rho^{-L}-\beta](\mu_L-\mu_{L-1})$ in curly brackets, $(1-\beta)\rho^{-L}-\beta \uparrow \infty$ and $\mu_L - \mu_{L-1} > 0$ as $L$ gets large when $\mu_L$ is decreasing. When $\beta > 1/2$, $1-\rho^{i} < 0$, $(1-2\beta) < 0$ and $\lambda > \mu_L$ for $L$ large since $\mu_L \to 0$. Also $(1-\beta)\rho^{-L}-\beta \downarrow -\beta$ and $\mu_L - \mu_{L-1} > 0$ given the monotonicity of $\{\mu_L\}_{L\ge 1}$. So also in this case $\theta_i^{(L)} - \theta_i^{(L-1)} > 0$. When $\beta = 1/2$,

$$\theta_i^{(L)} - \theta_i^{(L-1)} = \frac{L-i+1}{L+1}\lambda + \frac{i}{L+1}\mu_L - \frac{L-i}{L}\lambda - \frac{i}{L}\mu_{L-1} = \frac{1}{L(L+1)}\left[i(\lambda-\mu_{L-1}) + iL(\mu_L-\mu_{L-1})\right] > 0$$

for $L$ large since $\lambda > \mu_{L-1}$ and $\mu_L - \mu_{L-1} > 0$.


□ 
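
The chain of equalities in the proof of Lemma A.2 for $\beta \neq 1/2$ can be checked in exact arithmetic. The sketch below only verifies the algebraic identity between the direct difference $\theta_i^{(L)} - \theta_i^{(L-1)}$ and its factorised form; it makes no claim about the sign. It assumes the form of $\theta_i$ from (26) as written in the previous proof, and all names and numerical values are illustrative.

```python
from fractions import Fraction

def theta(i, L, beta, lam, mu):
    """theta_i^{(L)} for beta != 1/2, as a combination of lam and mu."""
    rho = beta / (1 - beta)
    D = (1 - beta) ** (L + 1) - beta ** (L + 1)
    w_lam = ((1 - beta) ** (L + 1) * rho ** i - beta ** (L + 1)) / D
    w_mu = (1 - beta) ** (L + 1) * (1 - rho ** i) / D
    return w_lam * lam + w_mu * mu

def difference_direct(i, L, beta, lam, mu_L, mu_prev):
    """theta_i^{(L)} with mu = mu_L minus theta_i^{(L-1)} with mu = mu_{L-1}."""
    return theta(i, L, beta, lam, mu_L) - theta(i, L - 1, beta, lam, mu_prev)

def difference_factorised(i, L, beta, lam, mu_L, mu_prev):
    """Last line of the display: common factor times the curly bracket."""
    rho = beta / (1 - beta)
    c = ((1 - beta) ** (L + 1) - beta ** (L + 1)) * ((1 - beta) ** L - beta ** L)
    bracket = ((1 - 2 * beta) * (lam - mu_L)
               + ((1 - beta) * rho ** (-L) - beta) * (mu_L - mu_prev))
    return (1 - beta) ** L * beta ** L * (1 - rho ** i) / c * bracket
```

For example, with `beta = Fraction(1, 3)`, `lam = Fraction(2)`, `mu_L = Fraction(1, 5)` and `mu_prev = Fraction(1, 4)`, both functions return the same rational number for every `i` from 1 to `L - 1`.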


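Proposition A.3. Let $M$ have transitions (15), and $M_L$ be as in Definition 6 with $\mu = \mu_L$. For any $L < \infty$, $M_L$ is embedded in $M$ at the Markov times $\sigma_k$ in the sense that the times $\{\sigma_k, k \ge 1\}$ occur infinitely often in $\mathbb{N}$ and

$$P\big(p_L(M(\sigma_k)) = m' \mid p_L(M(\sigma_{k-1})) = m\big) = P_L(m' \mid m)$$

for all $k$ and $m, m' \in \mathbb{Z}_+^{L}$.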


Proof. Let $L_i$ (resp. $R_i$) denote the event that the $i$th transition after $\sigma_{k-1}$ involves components $m_1,\dots,m_L$ (resp. $m_{L+1}, m_{L+2}, \dots$). Denote also by $P_{k-1,m}(\cdot)$ the conditional probability $P(\cdot \mid M(\sigma_{k-1}) = m)$ and by $E_{k-1,m}(\cdot)$ the respective conditional expectation. Given $m \in \mathbb{Z}_+^{\infty}$, we have

$$P_{k-1,m}(\sigma_k = \sigma_{k-1} + 1) = P_{k-1,m}(L_1) = \frac{\eta_L + \sum_{i=1}^{L} i m_i}{\eta_L + |m| - (1-\beta)(L+1)m_{L+1}},$$

where $\eta_L = \beta\lambda + (1-\beta)\mu_L$ and, for $h \ge 2$,

$$P_{k-1,m}(\sigma_k = \sigma_{k-1} + h) = P_{k-1,m}(L_h \mid R_1,\dots,R_{h-1}) \prod_{\ell=1}^{h-1} P_{k-1,m}(R_\ell \mid R_1,\dots,R_{\ell-1}).$$


The denominator of $P_{k-1,m}(L_h \mid R_1,\dots,R_{h-1})$ depends on the intermediate transitions and is thus random. We can factorise

$$1 - \sum_{h=1}^{H} P_{k-1,m}(\sigma_k = \sigma_{k-1} + h) = \prod_{h=1}^{H}\left[1 - P_{k-1,m}(L_h \mid R_1,\dots,R_{h-1})\right]$$

and the set $\{\sigma_n, n \ge 1\}$ has an infinite number of terms as long as

$$\sum_{h \ge 1} P_{k-1,m}(L_h \mid R_1,\dots,R_{h-1}) \uparrow \infty.$$

The latter holds since

$$P_{k-1,m}(L_h \mid R_1,\dots,R_{h-1}) \ge \frac{\eta_L + \sum_{i=1}^{L} i m_i}{\eta_L + |m| + h - 1},$$

and the right-hand side is not summable in $h$. Since now the transition at step $h$ is the first to involve the first $L$ components, whose configuration has not changed from $p_L(M(\sigma_{k-1}))$, the numerator of the respective probability can be isolated to write

$$P_{k-1,m}(\sigma_k = \sigma_{k-1} + h) = \Big(\eta_L + \sum_{i=1}^{L} i m_i\Big)\, E_{k-1,m}\big(T_h^{-1} \mid R_1,\dots,R_{h-1}\big) \prod_{\ell=1}^{h-1} P_{k-1,m}(R_\ell \mid R_1,\dots,R_{\ell-1}),$$


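where

$$T_h = \eta_L + \sum_{i \ge 1} i\,M_i(\sigma_{k-1} + h - 1) - (1-\beta)(L+1)\,M_{L+1}(\sigma_{k-1} + h - 1)$$

denotes the denominator in $P_{k-1,m}(L_h)$ and $M_i(h)$ the random variable for the $i$th component of $M(h)$. A similar derivation for the transition $m - e_i + e_{i+1}$, $i = 1,\dots,L$, occurring at step $h$ after $\sigma_{k-1}$, leads to writing

$$P_{k-1,m}\big(p_L(M(\sigma_{k-1} + h)) = p_L(m - e_i + e_{i+1}),\ \sigma_k = \sigma_{k-1} + h\big) = \beta\, i\, m_i\, E_{k-1,m}\big(T_h^{-1} \mid R_1,\dots,R_{h-1}\big) \prod_{\ell=1}^{h-1} P_{k-1,m}(R_\ell \mid R_1,\dots,R_{\ell-1}),$$

from which, in turn,

$$P\big(p_L(M(\sigma_k)) = p_L(m - e_i + e_{i+1}) \mid M(\sigma_{k-1}) = m,\ \sigma_k = \sigma_{k-1} + h\big) = \frac{\beta\, i\, m_i}{\eta_L + \sum_{j=1}^{L} j m_j}.$$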

An analogous statement can be derived with a similar argument for all other transitions involving one of the first $L$ components, which, since the right-hand side does not depend on $h$, leads to the result. □